Chunking-based Question Type Identification for Multi-Sentence Queries

Authors

  • Mineki Takechi
  • Takenobu Tokunaga
  • Yuji Matsumoto
Abstract

This paper describes a technique for identifying question types in multi-sentence queries for open-domain question answering. Based on observations of queries submitted to real question-answering services on the Web, we propose a method that decomposes a multi-sentence query into question items and identifies the question type of each item. The proposed method is an efficient sentence-chunking technique that uses a machine learning model, namely Conditional Random Fields. It handles multi-sentence queries comprising multiple question items, as well as traditional single-sentence queries, in the same framework. Based on the evaluation results, we discuss possible enhancements to improve accuracy and robustness.
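
As a rough illustration of the chunking idea in the abstract, the sketch below groups the sentences of a query into question items from per-sentence labels. It is not the authors' implementation: the BIO label scheme, the type names (REASON, QUANTITY) and the function name decode_question_items are assumptions made for this example, and the labels are supplied by hand where a trained Conditional Random Fields tagger would normally predict them.

```python
from typing import List, Tuple

# Hypothetical label scheme (an assumption, not taken from the paper):
#   "B-<TYPE>" = first sentence of a question item of type <TYPE>
#   "I-<TYPE>" = continuation of the current question item
#   "O"        = sentence that belongs to no question item
QuestionItem = Tuple[str, List[str]]


def decode_question_items(sentences: List[str], labels: List[str]) -> List[QuestionItem]:
    """Group sentences into (question_type, sentences) chunks from BIO labels."""
    items: List[QuestionItem] = []
    current_type = None
    current: List[str] = []
    for sentence, label in zip(sentences, labels):
        if label.startswith("B-"):
            # A new item starts; close the one that was open, if any.
            if current:
                items.append((current_type, current))
            current_type = label[2:]
            current = [sentence]
        elif label.startswith("I-") and current and label[2:] == current_type:
            # Continuation of the open item.
            current.append(sentence)
        else:
            # "O", or an I- label that does not match the open item:
            # close the open item and skip this sentence.
            if current:
                items.append((current_type, current))
            current_type, current = None, []
    if current:
        items.append((current_type, current))
    return items


if __name__ == "__main__":
    query = [
        "My laptop will not boot.",
        "What could be the cause?",
        "Thanks in advance.",
        "Also, how much does a repair usually cost?",
    ]
    # Hand-written labels standing in for the output of a sequence tagger.
    tags = ["B-REASON", "I-REASON", "O", "B-QUANTITY"]
    for question_type, chunk in decode_question_items(query, tags):
        print(question_type, chunk)
```

A single-sentence query is the special case of one sentence with one B- label, which is one way to read the abstract's claim that both kinds of query fit the same framework.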

Similar resources

Doctoral Dissertation Identification of Multi-Sentence Question Type and Extraction of Descriptive Answer in Open Domain Question-Answering

State-of-the-art question-answering systems provide the answer to the user by directly extracting the exact answer from a huge number of documents in large databases or Web pages. On the other hand, many online question-answering services, such as automatic answering services in call centers and Q&A web sites on the Internet, operate by restricting the domain of questions, and by utilizin...

Automatic Generation of Review Matrices as Multi-document Summarization of Scientific Papers

A synthesis matrix is a table that summarizes various aspects of multiple documents. In our work, we specifically examine a problem of automatically generating a synthesis matrix for scientific literature review. As described in this paper, we first formulate the task as multidocument summarization and question-answering tasks given a set of aspects of the review based on an investigation of sy...

Challenges in the Alignment, Management and Exploitation of Large and Richly Annotated Multi-Parallel Corpora

The availability of large multi-parallel corpora offers an enormous wealth of material to contrastive corpus linguists, translators and language learners, if we can exploit the data properly. Necessary preparation steps include sentence and word alignment across multiple languages. Additionally, linguistic annotation such as part-of-speech tagging, lemmatisation, chunking, and dependency parsin...

Discourse Chunking and its Application to Sentence Compression

In this paper we consider the problem of analysing sentence-level discourse structure. We introduce discourse chunking (i.e., the identification of intra-sentential nucleus and satellite spans) as an alternative to full-scale discourse parsing. Our experiments show that the proposed modelling approach yields results comparable to state-of-the-art while exploiting knowledge-lean features and sma...

A Hybrid Model for Phrase Chunking Employing Artificial Immunity System and Rule Based Methods

Natural Language Understanding (NLU), an important field of Artificial Intelligence (AI), is concerned with speech and language understanding between humans and computers. Understanding language means knowing what concept a word or phrase stands for and how to link them to form a meaningful sentence. Identification of phrases, or phrase chunking, is an important step in natural language understand...

Publication date: 1993